Mixed-effects modeling with crossed random effects for subjects and items
Authors
Abstract
This paper provides a non-technical introduction to mixed-effects models for the analysis of repeated measurement data with subjects and items as crossed random effects. A worked-out example of how to use recent software for mixed-effects modeling is provided. Simulation studies illustrate the advantages offered by mixed-effects analyses compared to traditional analyses based on quasi-F tests, by-subjects analyses, combined by-subjects and by-items analyses, and random regression. Applications and possibilities across a range of domains of inquiry are discussed.

∗ Corresponding author. Address: Max Planck Institute for Psycholinguistics, P.O. Box 310, 6500 AH, Nijmegen, The Netherlands. Authors: R.H. Baayen, D.J. Davidson, D.M. Bates. Preprint submitted to Elsevier, February 3, 2006.

1 General Introduction

Psycholinguists and other cognitive psychologists use convenience samples for their experiments, often based on participants within the local university community. When analyzing the data from these experiments, participants are treated as random variables, because the interest of most studies is not in experimental effects present only in the individuals who participated in the experiment, but rather in effects present in speakers everywhere, either within the language studied or among human language users in general. The differences between individuals due to genetic, developmental, environmental, social, political, or chance factors are modeled jointly by means of a participant random effect. A similar logic applies to linguistic materials.
Psycholinguists construct materials for the tasks that they employ by a variety of means, but most importantly, most materials in a single experiment do not exhaust all possible syllables, words, or sentences that could be found in a given language, and most choices of language to investigate do not exhaust the possible languages that an experimenter could investigate. In fact, two core principles of the structure of language, the arbitrary (and hence statistical) association between sound and meaning (de Saussure) and the unbounded combination of finite lexical items (von Humboldt), guarantee that a great many language materials must be a sample, rather than an exhaustive list. The space of possible words and the space of possible sentences are simply too large to be modeled by any other means. Just as we model human participants as random variables, we have to model factors characterizing their speech as random variables as well.

Clark (1973) illuminated this issue, sparked by the work of Coleman (1964), by showing how language researchers might generalize their results to the larger population of linguistic materials from which they sample by testing for statistical significance of experimental contrasts with participants and items analyses. Clark's oft-cited paper presented a technical solution to this modeling problem, based on statistical theory and computational methods available at the time (e.g., Winer, 1971). This solution involved computing a quasi-F statistic which, in its simplest-to-use form, could be approximated by a combined minimum-F statistic derived from separate participants (F1) and items (F2) analyses. In the 30+ years since, statistical techniques have expanded the space of possible solutions to this problem, but these techniques have not yet been applied widely in the field of language and memory studies.
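As an illustration of the combined minimum-F statistic mentioned above, the following minimal Python sketch computes min F' from a by-participants F1 and a by-items F2, using the commonly cited form min F' = F1·F2 / (F1 + F2) with an approximate denominator degrees of freedom. The function name, argument names, and example values are hypothetical and are not taken from the paper.

```python
def min_f_prime(f1, f2, df1_error, df2_error):
    """Combine by-participants (F1) and by-items (F2) F ratios into min F'.

    f1, f2       : F ratios from the separate participants and items analyses
    df1_error    : denominator (error) degrees of freedom of F1
    df2_error    : denominator (error) degrees of freedom of F2
    Returns (min F', approximate denominator degrees of freedom).
    The numerator degrees of freedom are those of the tested effect.
    """
    if f1 <= 0 or f2 <= 0:
        raise ValueError("F ratios must be positive")
    if df1_error <= 0 or df2_error <= 0:
        raise ValueError("degrees of freedom must be positive")

    # min F' is never larger than either F1 or F2 alone.
    statistic = (f1 * f2) / (f1 + f2)

    # Approximate denominator df: note that each squared F ratio is
    # divided by the error df of the *other* analysis.
    df_denominator = (f1 + f2) ** 2 / (f1 ** 2 / df2_error + f2 ** 2 / df1_error)
    return statistic, df_denominator


# Hypothetical example values, chosen only to make the arithmetic easy to follow.
stat, df = min_f_prime(10.0, 10.0, 20, 20)
print(stat, df)  # 5.0 40.0
```

Because min F' is bounded above by both F1 and F2, an effect can be significant in each separate analysis and still fail the combined test, which is the conservatism that the simulation comparisons in the paper are concerned with.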
The present paper offers a new alternative known as the mixed-effects modeling approach, based on maximum likelihood methods that are now in common use in many areas of science, medicine, and engineering. More specifically, we introduce a very recent development in computational statistics, namely, the possibility to include subjects and items as crossed random effects, as opposed to hierarchical or multilevel models in which random effects must be assumed to be nested. Traditional approaches to random effects modeling suffer from multiple drawbacks which can be eliminated by adopting linear mixed-effects models. These drawbacks include (a) deficiencies in statistical power related to the problems posed by repeated observations, (b) the lack of a flexible method of dealing with
Similar resources
Examples of mixed-effects modeling with crossed random effects and with binomial data
Psycholinguistic data are often analyzed with repeated-measures analyses of variance (ANOVA), but this paper argues that mixed-effects (multilevel) models provide a better alternative method. First, models are discussed in which the two random factors of participants and items are crossed, and not nested. Traditional ANOVAs are compared against these crossed mixed-effects models, for simulated ...
Mixed effects models in neurolinguistics: is there a way beyond univariate analysis?
Mixed effects models have recently gained some attention in psycho- and neurolinguistics applied research, even though their adoption is still not extensive enough (Meyers and Beretvas, 2006). In particular, psycho- and neurolinguistics data have a specific structure where subjects and items are reciprocally nested. Each subject is crossed within each item, since every person is stud...
On the use of multilevel modeling as an alternative to items analysis in psycholinguistic research.
The use of multilevel modeling is presented as an alternative to separate item and subject ANOVAs (F1 x F2) in psycholinguistic research. Multilevel modeling is commonly utilized to model variability arising from the nesting of lower level observations within higher level units (e.g., students within schools, repeated measures within individuals). However, multilevel models can also be used whe...
Beta-Binomial and Ordinal Joint Model with Random Effects for Analyzing Mixed Longitudinal Responses
The analysis of discrete mixed responses is an important statistical issue in various sciences. Ordinal and overdispersed binomial variables are discrete. Overdispersed binomial data are a sum of correlated Bernoulli experiments with equal success probabilities. In this paper, a joint model with random effects is proposed for analyzing mixed overdispersed binomial and ordinal longitudinal respo...
Publication date: 2007